AMF: Aggregated Mondrian Forests for Online Learning

Abstract

Random forest (RF) is one of the algorithms of choice in many supervised learning applications, be it classification or regression. The appeal of such tree-ensemble methods comes from a combination of several characteristics: remarkable accuracy in a variety of tasks, a small number of parameters to tune, robustness with respect to feature scaling, a reasonable computational cost for training and prediction, and their suitability in high-dimensional settings. The most commonly used RF variants, however, are ‘offline’ algorithms, which require the availability of the whole dataset at once. In this paper, we introduce AMF, an online random forest algorithm based on Mondrian Forests. Using a variant of the context tree weighting algorithm, we show that it is possible to efficiently perform an exact aggregation over all prunings of the trees; in particular, this makes it possible to obtain a truly online, parameter-free algorithm that is competitive with the optimal pruning of the Mondrian tree, and thus adaptive to the unknown regularity of the regression function. Numerical experiments show that AMF is competitive with several strong baselines on a large number of datasets for multi-class classification.
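
The idea of exact aggregation over all prunings can be illustrated on a small, fixed binary tree. The sketch below is not the AMF implementation: it is a batch toy example that assumes each node already stores a cumulative loss, uses a hypothetical learning rate `eta`, and assumes a branching-process prior that stops at each internal node with probability 1/2. Under those assumptions, a context-tree-weighting style recursion computes the total exponential weight of all prunings in one pass over the nodes, and a brute-force enumeration over prunings gives the same number. The names `Node`, `ctw_weight`, and `enumerate_prunings` are illustrative only.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A tree node holding the cumulative loss of its own local predictor."""
    loss: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def ctw_weight(node: Node, eta: float = 1.0) -> float:
    """Total weight of all prunings of the subtree rooted at `node`.

    Recursion: w(v) = exp(-eta * L_v) for a leaf of the full tree,
    and w(v) = 1/2 * exp(-eta * L_v) + 1/2 * w(left) * w(right) otherwise.
    """
    own = math.exp(-eta * node.loss)
    if node.left is None or node.right is None:
        return own
    return 0.5 * own + 0.5 * ctw_weight(node.left, eta) * ctw_weight(node.right, eta)


def enumerate_prunings(node: Node) -> List[Tuple[float, float]]:
    """List (prior, total leaf loss) for every pruning of the subtree."""
    if node.left is None or node.right is None:
        return [(1.0, node.loss)]
    # Option 1: prune here, so this node becomes a leaf of the pruning.
    result = [(0.5, node.loss)]
    # Option 2: keep the split and combine every pair of child prunings.
    for p_left, l_left in enumerate_prunings(node.left):
        for p_right, l_right in enumerate_prunings(node.right):
            result.append((0.5 * p_left * p_right, l_left + l_right))
    return result


def brute_force_weight(node: Node, eta: float = 1.0) -> float:
    """Sum of prior * exp(-eta * loss) over explicitly enumerated prunings."""
    return sum(p * math.exp(-eta * l) for p, l in enumerate_prunings(node))


# Toy tree: the root splits into a leaf and an internal node with two leaves.
tree = Node(3.0, Node(1.0), Node(1.5, Node(0.5), Node(0.25)))
fast = ctw_weight(tree)
slow = brute_force_weight(tree)
```

The recursion visits each node once, while the enumeration grows exponentially with the depth of the tree, which is why a recursion of this kind makes exact aggregation tractable.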

Similar resources

Mondrian Forests: Efficient Online Random Forests

Ensembles of randomized decision trees, usually referred to as random forests, are widely used for classification and regression tasks in machine learning and statistics. Random forests achieve competitive predictive performance and are computationally efficient to train and test, making them excellent candidates for real-world prediction tasks. The most popular random forest variants (such as ...

Universal consistency and minimax rates for online Mondrian Forests

We establish the consistency of an algorithm of Mondrian Forest [LRT14, LRT16], a randomized classification algorithm that can be implemented online. First, we amend the original Mondrian Forest algorithm proposed in [LRT14], that considers a fixed lifetime parameter. Indeed, the fact that this parameter is fixed actually hinders statistical consistency of the original procedure. Our modified M...

Mondrian Forests for Large-Scale Regression when Uncertainty Matters

Many real-world regression problems demand a measure of the uncertainty associated with each prediction. Standard decision forests deliver efficient state-of-the-art predictive performance, but high-quality uncertainty estimates are lacking. Gaussian processes (GPs) deliver uncertainty estimates, but scaling GPs to large-scale data sets comes at the cost of approximating the uncertainty estimat...

Aggregated Recommendation through Random Forests

Aggregated recommendation refers to the process of suggesting one kind of items to a group of users. Compared to user-oriented or item-oriented approaches, it is more general and, therefore, more appropriate for cold-start recommendation. In this paper, we propose a random forest approach to create aggregated recommender systems. The approach is used to predict the rating of a group of users to...

The Mondrian Process for Machine Learning

This report is concerned with the Mondrian process [1] and its applications in machine learning. The Mondrian process is a guillotine-partition-valued stochastic process that possesses an elegant self-consistency property. The first part of the report uses simple concepts from applied probability to define the Mondrian process and explore its properties. The Mondrian process has been used as th...

Journal

Journal title: Journal of the Royal Statistical Society: Series B (Statistical Methodology)

Year: 2021

ISSN: 1467-9868, 1369-7412

DOI: https://doi.org/10.1111/rssb.12425